Beyond Word Embeddings: Learning Entity and Concept Representations from Large Scale Knowledge Bases

Abstract

Text representation using neural word embeddings has proven efficacy in many NLP applications. Recently, a lot of research interest goes beyond word embeddings by adapting the traditional word embedding models to learn vectors of multiword expressions (concepts/entities). However, current methods are limited to textual knowledge bases only (e.g., Wikipedia). In this paper, we propose a novel approach for learning concept vectors from two large-scale knowledge bases (Wikipedia and Probase). We adapt the skip-gram model to seamlessly learn from the knowledge in Wikipedia text and the Probase concept graph. We evaluate our concept embedding models intrinsically on two tasks: 1) analogical reasoning, where we achieve a state-of-the-art performance of 91% on semantic analogies, and 2) concept categorization, where we achieve state-of-the-art performance on two benchmark datasets, with categorization accuracy of 100% on one and 98% on the other. Additionally, we present a case study to extrinsically evaluate our model on unsupervised argument type identification for neural semantic parsing. We demonstrate the competitive accuracy of our unsupervised method and its ability to better generalize to out-of-vocabulary entity mentions compared to the tedious and error-prone methods which depend on gazetteers and regular expressions.
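
The analogical reasoning task mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how analogies are scored, so this example assumes the vector-offset approach commonly used with word embeddings: for "a is to b as c is to ?", pick the concept whose vector is closest in cosine similarity to b - a + c. The function names and the 3-dimensional toy vectors are hypothetical and invented for illustration; they are not the paper's embeddings or code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def solve_analogy(embeddings, a, b, c):
    """Answer 'a is to b as c is to ?' by the vector-offset method.

    Assumption: this scoring rule is a common convention for embedding
    analogies, not something the abstract specifies.
    """
    target = [
        vb - va + vc
        for va, vb, vc in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    # Exclude the three query concepts from the candidate answers.
    candidates = [name for name in embeddings if name not in (a, b, c)]
    return max(candidates, key=lambda name: cosine(embeddings[name], target))


# Hypothetical toy concept vectors (not learned from Wikipedia or Probase).
toy_embeddings = {
    "France": [1.0, 0.0, 0.0],
    "Paris": [1.0, 1.0, 0.0],
    "Japan": [0.0, 0.0, 1.0],
    "Tokyo": [0.0, 1.0, 1.0],
    "Banana": [0.5, -1.0, 0.2],
}

answer = solve_analogy(toy_embeddings, "France", "Paris", "Japan")
print(answer)
```

In this toy setup, multiword concepts would simply be additional keys in the dictionary (for example a key such as "New York"), which is the practical difference between concept embeddings and single-word embeddings at lookup time.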
